Abstract. We review mathematically tractable models for connected 
networks on random points in the plane, emphasizing the class of prox- 
imity graphs which deserves to be better known to applied probabilists 
and statisticians. We introduce and motivate a particular statistic R 
measuring shortness of routes in a network. We illustrate, via Monte 
Carlo in part, the trade-off between normalized network length and R 
in a one-parameter family of proximity graphs. How close this family 
comes to the optimal trade-off over all possible networks remains an 
intriguing open question. 

The paper is a write-up of a talk developed by the first author during 
2007-2009. 
1. INTRODUCTION 

The topic called random networks or complex net- 
works has attracted huge attention over the last 20 
years. Much of this work focuses on examples such 
as social networks or WWW links, in which edges 
are not closely constrained by two-dimensional ge- 
ometry. In contrast, in a spatial network not only 
are vertices and edges situated in two-dimensional 
space, but also it is actual distances, rather than 
number of edges, that are of interest. To be concrete, 
we visualize idealized inter-city road networks, and 
a feature of interest is the (minimum) route length 
between two given cities. Because we work only in 
two dimensions, the word spatial may be mislead- 
ing, but equally the word planar would be mislead- 
ing because we do not require networks to be planar 
graphs (if edges cross, then a junction is created). 

Our major purpose is to draw the attention of 
readers from the applied probability and statistics 
communities to a particular class of spatial network 
models. Recall that the most studied network model, 
the random geometric graph [40] reviewed in Section 
2.1, does not permit both connectivity and bounded 
normalized length in the $n \to \infty$ limit. An attractive 
alternative is the class of proximity graphs, reviewed 
in Section 2.3, which in the deterministic case have 
been studied within computational geometry. These 
graphs are always connected. Proximity graphs on 
random points have been studied in only a few pa- 
pers, but are potentially interesting for many pur- 
poses other than the specific "short route lengths" 
topic of this paper (see Section 6.5). One could also 
imagine constructions which depend on points hav- 
ing specifically the Poisson point process distribu- 
tion, and one novel such network, which we name 
the Hammersley network, is described in Section 2.5. 

Visualizing idealized road networks, it is natu- 
ral to take total network length as the "cost" of a 
network, but what is the corresponding "benefit"? 
Primarily we are interested in having short route 
lengths. Choosing an appropriate statistic to measure
the latter turns out to be rather subtle, and 
the (only) technical innovation of this paper is the 
introduction (Section 3.2) and motivation of a spe- 
cific statistic R for measuring the effectiveness of a 
network in providing short routes. 

In the theory of spatial networks over random 
points, it is a challenge to quantify the trade-off 
between network length [precisely, the normalized 
length L defined at (2)] and route length efficiency 
statistics such as R. Our particular statistic R is not 
amenable to explicit calculation even in compara- 
tively tractable models, but in Section 4 we present 
the results from Monte Carlo simulations. In particular,
Figure 7 shows the trade-off for the particular
$\beta$-skeleton family of proximity graphs. 

Given a normalized network length L, for any real- 
ization of cities there is some network of normalized 
length L which minimizes R. As indicated in Sec- 
tion 5, by general abstract mathematical arguments, 
there must exist a deterministic function $R_{\mathrm{opt}}(L)$ 
giving (in the "number of cities $\to \infty$" limit under 
the random model) the minimum value of R over 
all possible networks of normalized length L. An in- 
triguing open question is as follows: 

how close are the values $R_{\beta\text{-skel}}(L)$ from
the $\beta$-skeleton proximity graphs to the optimum
values $R_{\mathrm{opt}}(L)$? 

As discussed in Section 5.3, at first sight it looks easy 
to design heuristic algorithms for networks which 
should improve over the $\beta$-skeletons, for example, 
by introducing Steiner points, but in practice we 
have not succeeded in doing so. 

This paper focuses on the random model for city 
positions because it seems the natural setting for 
theoretical study. As a complement, in [10] we give 
empirical data for the values of (L,R) for certain 
real-world networks (on the 20 largest cities, in each 
of 10 US States). In [8] we give analytic results and 
bounds on the trade-off between L and the mathematically
more tractable stretch statistic $R_{\max}$ at 
(4), in both worst-case and random-case settings 
for city positions. Let us also point out a (perhaps) 
nonobvious insight discussed in Section 3.3: in de- 
signing networks to be efficient in the sense of pro- 
viding short routes, the main difficulty is providing 
short routes between city-pairs at a specific distance 
(2-3 standardized units) apart, rather than between 
pairs at a large distance apart. 



Finally, recall this is a nontechnical account. Our 
purpose is to elaborate verbally the ideas outlined 
above; some technical aspects will be pursued else- 
where. 

2. MODELS FOR CONNECTED SPATIAL 
NETWORKS 

There are several conceptually different ways of 
defining networks on random points in the plane. To 
be concrete, we call the points cities; to be consistent
about language, we regard $x_i$ as the position of
city i and represent network edges as line segments $(x_i, x_j)$.

First (Sections 2.1-2.3) are schemes which use deterministic
rules to define edges for an arbitrary deterministic
configuration of cities; then one just applies these rules
to a random configuration. Second, one can have random rules
for edges in a deterministic configuration (e.g., the
probability of an edge between cities i and j is a function
of Euclidean distance $d(x_i, x_j)$, as in popular small
worlds models [39]), and again apply to a random
configuration. Third, and more subtly, one can have
constructions that depend on the randomness model for city
positions; Section 2.5 provides a novel example. 

We work throughout with reference to Euclidean 
distance d(x,y) on the plane, even though many 
models could be defined with reference to other met- 
rics (or even when the triangle inequality does not 
hold, for the MST). 

2.1 The Geometric Graph 

In Sections 2.1-2.3 we have an arbitrary configuration
$x = \{x_i\}$ of city positions, and a deterministic
rule for defining the edge-set $\mathcal{E}$. Usually in graph
theory one imagines a finite configuration, but note
that everything makes sense for locally finite
configurations too. Where helpful, we assume "general
position," so that intercity distances $d(x_i, x_j)$ are all
distinct. 

For the geometric graph one fixes $0 < c < \infty$ and
defines 

$(x_i, x_j) \in \mathcal{E}$ iff $d(x_i, x_j) < c$. 

For the K-neighbor graph one fixes $K \ge 1$ and defines 

$(x_i, x_j) \in \mathcal{E}$ iff $x_i$ is one of the K closest
neighbors of $x_j$, or $x_j$ is one of the K closest
neighbors of $x_i$. 



CONNECTED NETWORKS OVER RANDOM POINTS 






A moment's thought shows these graphs are in gen- 
eral not connected, so we turn to models which are 
"by construction" connected. We remark that the 
connectivity threshold $c_n$ in the finite n-vertex model 
of the random geometric graph has been studied in 
detail — see Chapter 13 of [40]. 
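
As a concrete illustration of the two definitions in this section, here is a minimal Python sketch (ours, not part of the original construction) that builds the geometric graph and the K-neighbor graph by brute force and tests connectivity. The function names and the six-city configuration are hypothetical choices made only for illustration.

```python
import math
from itertools import combinations


def geometric_graph(points, c):
    """Edges (i, j), i < j, whose Euclidean length is strictly below c."""
    return {(i, j) for i, j in combinations(range(len(points)), 2)
            if math.dist(points[i], points[j]) < c}


def k_neighbor_graph(points, k):
    """Edges (i, j), i < j, where one endpoint is among the k closest neighbors of the other."""
    n = len(points)
    edges = set()
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def is_connected(n, edges):
    """Depth-first search from vertex 0 over the undirected edge set."""
    adj = {i: [] for i in range(n)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen, stack = {0}, [0]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


# Two well-separated clusters of three cities (an illustrative configuration).
pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
print(is_connected(len(pts), geometric_graph(pts, 2)))
print(is_connected(len(pts), k_neighbor_graph(pts, 1)))
```

With clusters this far apart, neither rule links the two groups unless c or K is made large, which is the lack of connectivity noted above.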

2.2 A Nested Sequence of Connected Graphs 

The material here and in the next section was de- 
veloped in graph theory with a view toward algo- 
rithmic applications in computational geometry and 
pattern recognition. The 1992 survey [28] gives the 
history of the subject and 116 citations. But every- 
thing we need is immediate from the (careful choice 
of) definitions. On our arbitrary configuration x we 
can define four graphs whose edge-sets are nested as 
follows: 

(1)  MST $\subseteq$ relative n'hood $\subseteq$ Gabriel $\subseteq$ Delaunay. 

Here are the definitions (for MST and Delaunay, it 
is easy to check these are equivalent to more familiar 
definitions). In each case, we write the criterion for 
an edge $(x_i, x_j)$ to be present: 

• Minimum spanning tree (MST) [24]. There does
not exist a sequence $i = k_0, k_1, \dots, k_m = j$ of cities
such that 

$\max(d(x_{k_0}, x_{k_1}), d(x_{k_1}, x_{k_2}), \dots, d(x_{k_{m-1}}, x_{k_m})) < d(x_i, x_j)$. 

• Relative neighborhood graph. There does not exist
a city k such that 

$\max(d(x_i, x_k), d(x_k, x_j)) < d(x_i, x_j)$. 

• Gabriel graph. There does not exist a city inside
the disc whose diameter is the line segment from
$x_i$ to $x_j$. 

• Delaunay triangulation [23]. There exists some
disc, with $x_i$ and $x_j$ on its boundary, so that no
city is inside the disc. 

The inclusions (1) are immediate from these defini- 
tions. Because the MST (for a finite configuration) 
is connected, all these graphs are connected. 
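
The first three criteria above translate directly into brute-force code. The sketch below is ours and is meant only as an illustration: it uses squared distances to avoid square roots, builds the MST with Kruskal's algorithm (the familiar equivalent of the criterion above), and omits the Delaunay triangulation, which needs a circumcircle test. All function names are hypothetical.

```python
import random
from itertools import combinations


def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def unblocked_edges(pts, blocked):
    """Edges (i, j), i < j, for which no third city k satisfies blocked(i, j, k)."""
    n = len(pts)
    return {(i, j) for i, j in combinations(range(n), 2)
            if not any(blocked(i, j, k) for k in range(n) if k not in (i, j))}


def relative_neighborhood_graph(pts):
    # City k blocks (i, j) when max(d(x_i, x_k), d(x_k, x_j)) < d(x_i, x_j).
    return unblocked_edges(pts, lambda i, j, k:
                           max(sq_dist(pts[i], pts[k]), sq_dist(pts[k], pts[j]))
                           < sq_dist(pts[i], pts[j]))


def gabriel_graph(pts):
    # City k is inside the disc with diameter (x_i, x_j) exactly when
    # d(x_i, x_k)^2 + d(x_k, x_j)^2 < d(x_i, x_j)^2 (the angle at x_k exceeds 90 degrees).
    return unblocked_edges(pts, lambda i, j, k:
                           sq_dist(pts[i], pts[k]) + sq_dist(pts[k], pts[j])
                           < sq_dist(pts[i], pts[j]))


def minimum_spanning_tree(pts):
    """Kruskal's algorithm with a small union-find structure."""
    parent = list(range(len(pts)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    tree = set()
    for i, j in sorted(combinations(range(len(pts)), 2),
                       key=lambda e: sq_dist(pts[e[0]], pts[e[1]])):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.add((i, j))
    return tree


rng = random.Random(0)  # fixed seed so the run is reproducible
cities = [(rng.random(), rng.random()) for _ in range(60)]
mst = minimum_spanning_tree(cities)
rn = relative_neighborhood_graph(cities)
gab = gabriel_graph(cities)
print(len(mst), len(rn), len(gab), mst <= rn <= gab)
```

The final comparison checks the first two inclusions in (1) on one random configuration. The triple loop costs O(n^3) time, which is adequate for a few hundred cities but not for large simulations.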

Figure 1 illustrates the relative neighborhood and
Gabriel graphs. Figures for the MST and the Delaunay
triangulation can be found online at
http://www.spss.com/research/wilkinson/Applets/edges.html. 

Constructions such as the relative neighborhood 
and Gabriel graphs have become known loosely as 
proximity graphs in [28] and subsequent literature, 
and we next take the opportunity to turn an implicit 
definition in the literature into an explicit definition. 

2.3 Proximity Graphs 

Write $v_-$ and $v_+$ for the points $(-\tfrac{1}{2}, 0)$ and $(\tfrac{1}{2}, 0)$.
The lune is the intersection of the open discs of
radii 1 centered at $v_-$ and $v_+$. So $v_-$ and $v_+$ are
not in the lune but are on its boundary. Define a
template A to be a subset of $\mathbb{R}^2$ such that: 

(i) A is a subset of the lune. 

(ii) A contains the open line segment $(v_-, v_+)$. 

(iii) A is invariant under the "reflection in the y-axis"
map $\mathrm{Reflect}_x(x_1, x_2) = (-x_1, x_2)$ and the
"reflection in the x-axis" map
$\mathrm{Reflect}_y(x_1, x_2) = (x_1, -x_2)$. 

(iv) A is open. 

For arbitrary points x, y in $\mathbb{R}^2$, define A(x, y) to
be the image of A under the natural transformation
(translation, rotation and scaling) that takes
$(v_-, v_+)$ to (x, y). 

Fig. 1. The relative neighborhood graph (left) and Gabriel graph (right) on different realizations of 500 random points. 

Definition. Given a template A and a locally 
finite set V of vertices, the associated proximity graph 
G has edges defined by, for each $x, y \in V$, 

(x,y) is an edge of G iff A(x,y) contains 
no vertex of V. 

From the definitions: 

• if A is the lune, then G is the relative neighbor- 
hood graph; 

• if A is the disc centered at the origin with radius 
1/2, then G is the Gabriel graph. 

But the MST and Delaunay triangulation are not 
instances of proximity graphs. 

Note that replacing A by a subset A' can only 
introduce extra edges. It follows from (1) that the 
proximity graph is always connected. The Gabriel 
graph is planar. But if A is not a superset of the disc 
centered at the origin with radius 1/2, then G might 
not be a subgraph of the Delaunay triangulation, 
and in this case edges may cross, so G is not planar 
(e.g., if the vertex-set is the four corners of a square, 
then the diagonals would be edges). 

For a given configuration x, there is a collection of
proximity graphs indexed by the template A, so by
choosing a monotone one-parameter family of templates,
one gets a monotone one-parameter family
of graphs, analogous to the one-parameter family
$\mathcal{G}_c$ of geometric graphs. Here is a popular choice [30]
in which $\beta = 1$ gives the Gabriel graph and $\beta = 2$
gives the relative neighborhood graph. 

Definition (The $\beta$-skeleton family). (i) For $0 < \beta < 1$
let $A_\beta$ be the intersection of the two open discs
of radius $(2\beta)^{-1}$ passing through $v_-$ and $v_+$. 

(ii) For $1 \le \beta \le 2$ let $A_\beta$ be the intersection of
the two open discs of radius $\beta/2$ centered at
$(\pm(\beta - 1)/2, 0)$. 
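
To make the definition concrete, the following Python sketch (ours; the names `in_template` and `beta_skeleton` are hypothetical) maps the two discs defining $A_\beta$ onto an arbitrary pair (x, y) by translation, rotation and scaling, and then applies the proximity-graph rule by brute force. It assumes points in general position and ignores boundary cases.

```python
import math
from itertools import combinations


def in_template(beta, x, y, z):
    """True if z lies in the open region A_beta(x, y), for 0 < beta <= 2."""
    mx, my = (x[0] + y[0]) / 2, (x[1] + y[1]) / 2   # image of the origin
    ux, uy = y[0] - x[0], y[1] - x[1]               # image of v_+ - v_-
    length = math.hypot(ux, uy)                     # scaling factor
    if beta >= 1:
        # Discs of radius beta/2 centered at (+-(beta - 1)/2, 0) in template coordinates.
        radius = beta * length / 2
        shift = (beta - 1) / 2
        centers = [(mx + s * shift * ux, my + s * shift * uy) for s in (1, -1)]
    else:
        # Discs of radius 1/(2 beta) through v_- and v_+: centers on the perpendicular
        # bisector at height h = sqrt(radius^2 - 1/4) in template coordinates.
        radius = length / (2 * beta)
        h = math.sqrt(1 / (4 * beta ** 2) - 0.25)
        centers = [(mx - s * h * uy, my + s * h * ux) for s in (1, -1)]
    return all(math.dist(z, c) < radius for c in centers)


def beta_skeleton(pts, beta):
    """Edges (i, j), i < j, whose template A_beta(x_i, x_j) contains no other city."""
    n = len(pts)
    return {(i, j) for i, j in combinations(range(n), 2)
            if not any(in_template(beta, pts[i], pts[j], pts[k])
                       for k in range(n) if k not in (i, j))}


# Three cities: the third sits inside the Gabriel disc of the first two,
# but outside the thinner lens obtained for beta = 0.5.
pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.9)]
for beta in (0.5, 1, 2):
    print(beta, sorted(beta_skeleton(pts, beta)))
```

Because the templates grow with $\beta$, the edge sets shrink as $\beta$ increases, which the loop over three values of $\beta$ illustrates on a tiny configuration.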

2.4 Networks Based on Powers of Edge-Lengths 

It is not hard to think of other ways to define
one-parameter families of networks. Here is one scheme
used in, for example, [38]. Fix $1 \le p < \infty$. Given a
configuration x, and a route (sequence of vertices)
$x_0, x_1, \dots, x_k$, say, the cost of the route is the sum of
pth powers of the step lengths. Now say that a pair
(x, y) is an edge of the network $\mathcal{G}_p$ if the cheapest
route from x to y is the one-step route. As p increases
from 1 to $\infty$, these networks decrease from
the complete graph to the MST. Moreover, for $p \ge 2$
the network $\mathcal{G}_p$ is a subgraph of the Gabriel graph. 
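
One way to compute this network, sketched here under our own assumptions rather than taken from [38], is to run the Floyd-Warshall all-pairs algorithm on the costs $d(x_i, x_j)^p$ and keep a pair as an edge exactly when no multi-step route is strictly cheaper than the direct step. The function name `power_network` is hypothetical.

```python
import math
from itertools import combinations


def power_network(pts, p):
    """Edges (i, j), i < j, whose one-step route is cheapest under p-th power costs."""
    n = len(pts)
    direct = [[math.dist(a, b) ** p for b in pts] for a in pts]
    best = [row[:] for row in direct]
    for k in range(n):                      # Floyd-Warshall relaxation
        for i in range(n):
            for j in range(n):
                via_k = best[i][k] + best[k][j]
                if via_k < best[i][j]:
                    best[i][j] = via_k
    # best[i][j] stays equal to the direct cost only if nothing cheaper was found.
    return {(i, j) for i, j in combinations(range(n), 2)
            if best[i][j] == direct[i][j]}


# Three nearly collinear cities: for p = 1 the direct step is always cheapest,
# while for p = 2 the long step is undercut by the two short ones.
pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.1)]
print(sorted(power_network(pts, 1)))
print(sorted(power_network(pts, 2)))
```

The cubic running time limits this sketch to modest n; it is intended to show the definition at work, not to be an efficient construction.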



2.5 The Hammersley Network 

There is a quite separate recent literature in the- 
oretical probability [26, 27] defining structures such 
as trees and matchings directly on the infinite Pois- 
son point process. In this spirit, we observe that the 
Hammersley process studied in [6] can be used to 
define a new network on the infinite Poisson point 
process, which we name the Hammersley network. 
This network is designed to have the feature that 
each vertex has exactly 4 edges, in directions NE 
(between North and East), NW, SE and SW. The 
conceptual difference from the networks in the previ- 
ous section is that there is not such a simple "local" 
criterion for whether a potential edge $(x_i, x_j)$ is in 
the network. And edges cross, creating junctions. 

For a picturesque description, imagine one-eyed 
frogs sitting on an infinitely long, thin log, each be- 
ing able to see only the part of the log to their left 
before the next frog. At random times and positions 
(precisely, as a space-time Poisson point process of 
rate 1) a fly lands on the log, at which instant the 
(unique) frog which can see it jumps left to the fly's 
position and eats it. This defines a continuous time 
Markov process (the Hammersley process) whose 
states are the configurations of positions of all the 
frogs. There is a stationary version of the process in 
which, at each time, the positions of the frogs form 
a Poisson (rate 1) point process on the line. 

Now consider the space-time trajectories of all the 
frogs, drawn with time increasing upward on the 
page. See Figure 2. For each frog, the part of the 
trajectory between the completions of two successive 
jumps consists of an upward edge (the frog remains 
in place as time increases) followed by a leftward 
edge (the frog jumps left). 

Reinterpreting the time axis as a second space
axis, and introducing compass directions, that part
of the trajectory becomes a North edge followed by
a West edge. Now replace these two edges by a single
North-West straight edge. Doing this procedure for 
each frog and each pair of successive jumps, we ob- 
tain a collection of NW paths, that is, a network in 
which each city (the reinterpreted space-time ran- 
dom points) has an edge to the NW and an edge 
to the SE. Finally, we repeat the construction with 
the same realization of the space-time Poisson point 
process but with frogs jumping rightward instead of 
leftward. This yields a network on the infinite Pois- 
son point process, which we name the Hammersley 
network. See Figure 3. 
Remarks. (a) To draw the Hammersley network
on random points in a finite square, one needs external
randomization to give the initial (time 0) frog
positions, in fact, two independent randomizations
for the leftward and the rightward processes. So to
be pedantic, one gets a random network over the
given realization of cities. However, one can deduce
from the theoretical results in [6] that the external
randomization has effect only near the boundary of
the square. 

Fig. 2. Space-time trajectories in Hammersley's process. 

Fig. 3. The Hammersley network on 2500 random points. 

(b) The property that each vertex has exactly 
4 edges, in directions NE (between North and East), 
NW, SE and SW, is immediate from the construc- 
tion. Note, however, that while adjacent NW space- 
time trajectories in Figure 2 do not cross, the corre- 
sponding diagonal roads in the Hammersley network 
may cross, so it is not a planar graph, though this 
has only negligible effect on route lengths. 

(c) Intuition, confirmed by Figure 7 later, says 
that the Hammersley network is not very efficient as 
a road network. It serves to demonstrate that there 
do exist random networks other than the familiar 
ones, and provides an instance where imposing de- 
terministic constraints (the four edges, in this case) 
on a random network makes it much less efficient. 
How general a phenomenon is this? 

2.6 Normalized Length 

The notion of normalized network length L is most 
easily visualized in the setting of an infinite deter- 
ministic network which is "regular" in the sense of 
consisting of a repeated pattern. First choose the 
unit of length so that cities have an average density 
of one per unit area. Then define 

(2)  L = average network length per unit area, 

(3)  A = average degree (number of incident edges) of cities. 

Figure 4 shows the values of L and A for some 
simple "repeated pattern" networks. Though not di- 
rectly relevant to our study of the random model, we 
find Figure 4 helpful for two reasons: as intuition for 
the interpretation of the different numerical values 
of L, and because we can make very loose analogies 
(Section 6.6) between particular networks on ran- 
dom points and particular deterministic networks. 
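
As a worked check of definitions (2) and (3), here is a short calculation of ours for two of the regular patterns. It uses the fact that, when every edge joins two cities, L equals half the average degree times the mean edge length; the symbol a for the lattice edge length is introduced only for this example.

```latex
% Square lattice at density 1: edge length 1, degree 4,
% and each edge is shared by its two endpoint cities.
L_{\text{square}} = \tfrac{1}{2} \cdot 4 \cdot 1 = 2, \qquad A = 4.

% Triangular lattice with edge length a: each city accounts for area (\sqrt{3}/2) a^2,
% so density 1 fixes a.
\tfrac{\sqrt{3}}{2}\, a^2 = 1 \;\Longrightarrow\; a = \sqrt{2/\sqrt{3}} \approx 1.075,
\qquad L_{\text{triangular}} = \tfrac{1}{2} \cdot 6 \cdot a = 3a \approx 3.22, \qquad A = 6.
```

Both values agree with the labels on the square and triangular lattices in Figure 4.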

3. NORMALIZED LENGTH AND 
ROUTE-LENGTH EFFICIENCY 

3.1 The Random Model 

For the remainder of the paper we work with "the 
random model" for city positions. The finite model 
assumes n random vertices (cities) distributed inde- 
pendently and uniformly in a square of area n. The 
infinite model assumes the Poisson point process of 
rate 1 (per unit area) in the plane. The quantities
L, A above and R below that we discuss may be
interpreted as exact values in the infinite model or as
$n \to \infty$ limits in the finite model; see Section 5. We
use the word normalized as a reminder of the "density 1"
convention: we choose the normalized unit
of distance to make cities have average density 1 per
unit area. After this normalization, L is the average
network length per unit area. 

Fig. 4. Variant square, triangular and hexagonal lattices. Drawn so that the density of cities is the same in each diagram,
and ordered by value of L. [Each panel is labeled with its values of L and A; the labels include the punctured lattice
(L = 1.25, A = 2.5), the square lattice (L = 2.00, A = 4), the diagonal lattice (L = 2.83, A = 4) and the triangular
lattice (L = 3.22, A = 6).] 

3.2 The Route-Length Efficiency Statistic R 

In designing a network, it is natural to regard total
length as a "cost". The corresponding "benefit" is
having short routes between cities. Write $\ell(i,j)$ for
the route length (length of shortest path) between
cities i and j in a given network, and d(i,j) for
Euclidean distance between the cities. So
$\ell(i,j) \ge d(i,j)$, and we write 

$r(i,j) = \frac{\ell(i,j)}{d(i,j)} - 1$ 

so that "r(i,j) = 0.2" means that route length is
20% longer than straight line distance. With n cities
we get $\binom{n}{2}$ such numbers r(i,j); what is a reasonable
way to combine these into a single statistic? Two
natural possibilities are as follows: 

(4)  $R_{\max} := \max_{(i,j)} r(i,j)$,  $R_{\mathrm{ave}} := \mathrm{ave}_{(i,j)} r(i,j)$, 

where $\mathrm{ave}_{(i,j)}$ denotes average over all distinct pairs
(i,j). The statistic $R_{\max}$ has been studied in the
context of the design of geometric spanner networks
[37] where it is called the stretch. However, being
an "extremal" statistic, $R_{\max}$ seems unsatisfactory
as a descriptor of real world networks; for instance,
it seems unreasonable to characterize the UK rail
network as inefficient simply because there is no very
direct route between Oxford and Cambridge. 

Fig. 5. Efficient or inefficient? $R_{\mathrm{ave}}$ would judge this network efficient in the $n \to \infty$ limit. 

The statistic $R_{\mathrm{ave}}$ has a more subtle drawback. 
Consider a network consisting of: 

• the minimum-length connected network (Steiner 
tree) on given cities; 

• and a superimposed sparse collection of randomly 
oriented lines (a Poisson line process [45]). 

See Figure 5. By choosing the density of lines to
be sufficiently low, one can make the normalized
network length be arbitrarily close to the minimum
needed for connectivity. But it is easy to show (see
[7] for careful analysis and a stronger result) that
one can construct such networks so that $R_{\mathrm{ave}} \to 0$ as
$n \to \infty$. Of course no one would build a road network
looking like Figure 5 to link cities, because there are
many pairs of nearby cities with only very indirect
routes between them. The disadvantage of $R_{\mathrm{ave}}$ as a
descriptive statistic is that (for large n) most city-pairs
are far apart, so the fact that a given network
has a small value of $R_{\mathrm{ave}}$ says nothing about route
lengths between nearby cities. 

We propose a statistic R which is intermediate between
$R_{\mathrm{ave}}$ and $R_{\max}$. First consider (see discussion
below for details) 

p(d) := mean value of r(i,j) over
city-pairs with d(i,j) = d 

and then define 

(5)  $R := \max_{0 < d < \infty} p(d)$. 

In words, R = 0.2 means that on every scale of distance,
route lengths are on average at most 20%
longer than straight line distance. 

On an intuitive level, R provides a sensible and 
interpretable way to compare efficiency of different 
networks in providing short routes. On a technical 
level, we see two advantages and one disadvantage 
of using R instead of $R_{\mathrm{ave}}$. 

Advantage 1. Using R to measure efficiency, there
is a meaningful $n \to \infty$ limit for the network length/efficiency
trade-off [the function $R_{\mathrm{opt}}(L)$ discussed
in Section 5], and so, in particular, it makes sense
to compare the values of R for networks with different n. 

Advantage 2. A more realistic model for traffic
would posit that volume of traffic between two cities
varies as a power-law $d^{-\gamma}$ of distance d, so that in
calculating $R_{\mathrm{ave}}$ it would be more realistic to weight
by $d^{-\gamma}$. This means that the optimal network, when
using $R_{\mathrm{ave}}$ as optimality criterion, would depend on
$\gamma$. Use of R finesses this issue; the value of $\gamma$ does
not affect R. A related issue is that volume of traffic
between two cities should depend on their populations.
Intuitively, incorporating random population
sizes should make the optimal R smaller because the
network designer can create shorter routes between
larger cities. We see this effect in data [10]; R calculated
via population-weighting is typically slightly
smaller. But we have not tried theoretical study. 

Disadvantage. The statistic R is tailored to the infinite
model, in which it makes sense to consider two
cities at exactly distance d apart (then the other city
positions form a Poisson point process). For finite n
we need to discretize. For the empirical data in [10],
where n = 20, we average over intervals of width 1
unit (recall the unit of distance is taken such that
the density of cities is 1 per unit area), that is, for
d = 1, 2, ..., 5, we calculate 

(6)  $\tilde{p}(d)$ := mean value of r(i,j) over city-pairs
     with $d - \tfrac{1}{2} < d(i,j) < d + \tfrac{1}{2}$, 

     $\tilde{R} := \max_{1 \le d < \infty} \tilde{p}(d)$ 

and use $\tilde{R}$ as proxy for R. For larger n we can use
shorter intervals. Thus, there is, in principle, a certain
fuzziness to the notion of R for finite networks,
and, in particular, it is not clear how to assign a
value of R to regular networks such as those in Figure 4.
But in practice, for networks we have studied
on real-world data and on random points, this is not
a problem, as explained next. 
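
The discretized statistic (6) is straightforward to compute for a given finite network. The Python sketch below is ours and rests on stated assumptions: coordinates are already in normalized units (density 1), the network is connected, bins have width 1 centered on d = 1, ..., 5, and route lengths come from Dijkstra's algorithm. The function names are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def shortest_route_lengths(adj, source):
    """Dijkstra's algorithm: route length from `source` to every reachable city."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        length, u = heapq.heappop(heap)
        if length > best[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if length + w < best.get(v, math.inf):
                best[v] = length + w
                heapq.heappush(heap, (length + w, v))
    return best


def discretized_R(pts, edges, max_d=5):
    """Return ({d: mean of r(i,j) in the bin around d}, maximum over the bins)."""
    adj = defaultdict(list)
    for i, j in edges:
        w = math.dist(pts[i], pts[j])
        adj[i].append((j, w))
        adj[j].append((i, w))
    total, count = defaultdict(float), defaultdict(int)
    for i in range(len(pts)):
        ell = shortest_route_lengths(adj, i)
        for j in range(i + 1, len(pts)):
            d = math.dist(pts[i], pts[j])
            bin_d = math.floor(d + 0.5)   # integer whose width-1 bin contains d(i,j)
            if 1 <= bin_d <= max_d:
                total[bin_d] += ell.get(j, math.inf) / d - 1   # r(i,j)
                count[bin_d] += 1
    profile = {d: total[d] / count[d] for d in sorted(count)}
    return profile, max(profile.values())


# Four cities at the corners of a square of side 2, joined by the four sides.
pts = [(0, 0), (2, 0), (2, 2), (0, 2)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
profile, R_tilde = discretized_R(pts, edges)
print(profile, R_tilde)
```

In this toy network the side pairs fall in the bin d = 2 with r = 0, while the two diagonal pairs fall in the bin d = 3 with route length 4 against straight-line distance $2\sqrt{2}$, so the maximum is attained in the second bin.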

3.3 Characteristic Shape of the Function p(d) 

For the connected networks on random points (excluding
the Hammersley network) we are discussing,
the function p(d) has a characteristic shape (see
Figure 6) attaining its maximum between 2 and 3
and slowly decreasing thereafter. We suspect that
"this characteristic shape holds for any reasonable
model," but we do not know how to turn that phrase
into a precise conjecture. Note that "smoothness
near the maximum" implies that any calculated value
$\tilde{R}$ at (6) is quite insensitive to the choice of
discretization. 

This characteristic shape has a common-sense
interpretation. Any efficient network will tend to place
roads directly between unusually close city-pairs,
implying that p(d) should be small for d < 1. For
large d the presence of multiple alternate routes
helps prevent p(d) from growing. At distance 2-3
from a typical city i there will be about
$\pi 3^2 - \pi 2^2 \approx 16$ other cities j. For some of these j
there will be cities k near the straight line from i to j, so the
network designer can create roads from i to k to j.
The difficulty arises where there is no such intermediate
city k: including a direct road $(x_i, x_j)$ will
increase L, but not including it will increase p(d) for
2 < d < 3. 

Thus, Figure 6 offers a minor insight into spatial
network design: that it is city pairs at normalized
distance 2-3 specifically that enforce the
constraints on efficient network design. 

The characteristic shape (at least, the flatness over
2 < d < 5) is also visible in the real-world data [10]. 

For the Hammersley network, the graph of p(d) is
quite different; p(d) increases to a maximum of 0.35
around d = 0.8 and then decreases more steeply to
a value of 0.21 at d = 5. This arises from the particular
structure (from each city there is one road in
each quadrant) resembling the deterministic "diagonal
lattice" of Figure 4, in which the route between
some nearby pairs will be via two diagonal roads and
a junction. 

4. LENGTH-EFFICIENCY TRADE-OFF FOR 
TRACTABLE NETWORKS 

Recall that our overall theme is the trade-off be- 
tween network length and route-length efficiency, 
and that in this paper we focus on $n \to \infty$ limits 
in the random model and the particular statistics L 
and R. 

The models described in Section 2 are "tractable" 
in the specific sense that one can find exact analytic 
formulas for normalized length L. Unfortunately R 
is not amenable to analytic calculation, and we re- 
sort to Monte Carlo simulation to obtain values for 
R. Table 1 and Figure 7 show the values of (L, R) 
in the models. We explain below how the values of 
L are calculated. 

Notes on Table 1. (a) Values of R from our simu- 
lations with n = 2500. 

(b) Value of L for MST from Monte Carlo [19]. In 
principle, one can calculate arbitrarily close bounds 
[11], but apparently this has never been carried out. 
Of course, A = 2 for any tree. 

(c) The Gabriel graph and the relative neighborhood
graph fit the assumptions of Lemma 1 with
$c = \pi/4$ and $c = \frac{2\pi}{3} - \frac{\sqrt{3}}{2}$, respectively, and their
table entries for L and A are obtained from Lemma 1,
as are the values for $\beta$-skeletons in Figure 7. 

Table 1
Statistics of tractable networks on random points

Network                  L       A      R
Minimum spanning tree    0.633   2      ∞
Relative n'hood          1.02    2.56   0.38
Gabriel                  2       4      0.15
Hammersley               3.25    4      0.35
Delaunay                 3.40    6      0.07

Notes: Integer values are exact. Recall L is normalized length (2), A is average degree (3) and R is our
route-length statistic (5).

Fig. 6. The function p(d) for three theoretical networks on random cities (relative n'hood, Gabriel and Delaunay),
plotted against normalized distance d. Irregularities are Monte Carlo random variation.

Fig. 7. The normalized network length L and the route-length efficiency statistic R for certain networks on random points.
The ○ show the beta-skeleton family, with RN the relative neighborhood graph and G the Gabriel graph. The • are special models:
the Delaunay triangulation (△), the network $\mathcal{G}_2$ from Section 2.4 (□) and the Hammersley network.

(d) For the Hammersley network, every degree
equals 4, so L = 2 × (mean edge-length). It follows
from theory [6] that a typical edge, say, NE from
(x, y), goes to a city at position $(x + \xi_x, y + \xi_y)$, where
$\xi_x$ and $\xi_y$ are independent with Exponential(1)
distribution. So mean edge-length equals

(7)  $\int_0^\infty \int_0^\infty \sqrt{x^2 + y^2}\, e^{-x-y}\, dx\, dy \approx 1.62$.

(e) For any triangulation, A = 6 in the infinite
model. For the Delaunay triangulation, L = ES where
S is the perimeter length of a typical cell, and it
is known ([35], page 113) that $ES = \frac{32}{3\pi}$. Note [33]
that the Delaunay triangulation is in general not
the minimum-length triangulation. Our simulation
results in Figure 6 for p(d) for the Delaunay
triangulation are roughly consistent with a simulation
result in [13] saying that $p(65) \approx 0.05$.
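
The mean edge-length integral (7) in note (d) can be evaluated in closed form. The following short derivation is ours, not taken from [6]; it passes to polar coordinates and uses $\cos\theta + \sin\theta = \sqrt{2}\cos(\theta - \pi/4)$.

```latex
\int_0^\infty\!\!\int_0^\infty \sqrt{x^2+y^2}\; e^{-x-y}\, dx\, dy
  = \int_0^{\pi/2}\!\!\int_0^\infty r^2 e^{-r(\cos\theta+\sin\theta)}\, dr\, d\theta
  = \int_0^{\pi/2} \frac{2}{(\cos\theta+\sin\theta)^3}\, d\theta
\\[4pt]
  = \frac{1}{\sqrt{2}} \int_{-\pi/4}^{\pi/4} \sec^3 u \, du
  = \frac{1}{\sqrt{2}} \Bigl( \sqrt{2} + \ln\bigl(1+\sqrt{2}\bigr) \Bigr)
  = 1 + \frac{\ln(1+\sqrt{2})}{\sqrt{2}} \approx 1.623 .
```

Doubling this value gives $L \approx 3.25$, in agreement with the Hammersley entry of Table 1.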

4.1 A Simple Calculation for Proximity Graphs 

Let us give an example of an elementary calcula- 
tion for proximity graphs over random points. 






D. J. ALDOUS AND J. SHUN 



Lemma 1. For a proximity graph with template
A on the Poisson point process, the normalized length (2)
and the average degree (3) are

(8)  $L = \frac{\pi^{3/2}}{4 c^{3/2}}$,

(9)  average degree $= \frac{\pi}{c}$,

where c = area(A). 

Proof. Take a typical city at position $x_0$. For
a city x at distance s the chance that $(x_0, x)$ is an
edge equals $\exp(-cs^2)$ and so 

mean degree $= \int_0^\infty \exp(-cs^2)\, 2\pi s\, ds$, 

$L = \frac{1}{2} \int_0^\infty s \exp(-cs^2)\, 2\pi s\, ds$. 

Evaluating the integrals gives (8) and (9). □ 
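
As a numerical sanity check on Lemma 1 (a sketch of ours, not part of the proof), the code below compares the closed forms (8) and (9) with midpoint-rule approximations of the two integrals, for the Gabriel disc of area $\pi/4$ and for the lune, whose area $2\pi/3 - \sqrt{3}/2$ follows from the standard formula for the intersection of two unit discs with centers at distance 1. The function names and the truncation point are assumptions of this example.

```python
import math


def lemma1_constants(c):
    """Closed forms (8) and (9): normalized length and mean degree for template area c."""
    return math.pi ** 1.5 / (4 * c ** 1.5), math.pi / c


def midpoint_integral(f, upper, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [0, upper]."""
    h = upper / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))


def lemma1_numeric(c, upper=12.0):
    """The two integrals from the proof, truncated at `upper` (the tail is negligible)."""
    degree = midpoint_integral(
        lambda s: math.exp(-c * s * s) * 2 * math.pi * s, upper)
    length = 0.5 * midpoint_integral(
        lambda s: s * math.exp(-c * s * s) * 2 * math.pi * s, upper)
    return length, degree


gabriel_c = math.pi / 4                        # disc of radius 1/2
lune_c = 2 * math.pi / 3 - math.sqrt(3) / 2    # lune of two unit discs at distance 1
for name, c in [("Gabriel", gabriel_c), ("relative n'hood", lune_c)]:
    print(name, lemma1_constants(c), lemma1_numeric(c))
```

The closed forms give L = 2 and mean degree 4 for the Gabriel graph, and values that round to 1.02 and 2.56 for the relative neighborhood graph, matching Table 1.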

One can derive similar integral formulas for other 
"local" characteristics, for example, mean density 
of triangles and moments of vertex degree. See [18, 
20, 21, 34] for a variety of such generalizations and 
specializations. 

4.2 Other Tractable Networks 

We do not know any other ways of defining net- 
works on random points which are both "natural" 
and are tractable in the sense that one can find ex- 
act analytic formulas for L. In particular, we know 
no tractable way of defining networks with deliber- 
ate junctions as in Figure 8. Note also that, while 
it is easy to make ad hoc modifications to the ge- 
ometric graph to ensure connectivity, these destroy 
tractability. On the other hand, one can construct 
"unnatural" networks (see, e.g., [8]) designed to per- 
mit calculation of L. 

5. OPTIMAL NETWORKS AND $n \to \infty$
LIMITS 

5.1 Tractable Models 

As mentioned earlier, the quantities L, A, R we
discuss may be interpreted as exact values in the infinite
model or as $n \to \infty$ limits in the finite model.
To elaborate briefly, in a realization of the finite
model (n cities distributed independently and uniformly
in a square of area n), a network in Table 1
has a normalized length $L_n = n^{-1} \times$ (network length)
and an average degree $A_n$ which are random variables,
but there is convergence (in probability and
in expectation) 

(10)  $L_n \to L$, $A_n \to A$ as $n \to \infty$ 

to limit constants definable in terms of the analogous
network on the infinite model (rate 1 Poisson
point process on the infinite plane). For the proximity
graphs or Delaunay triangulation, the network
definition applies directly to the infinite model and
proof of (10) is straightforward. For the Hammersley
network, (10) is implicit in [6], and for the MST
detailed arguments can be found in [9, 43]. 

Fig. 8. An ad hoc modification of the relative neighborhood
graph, introducing junctions. 

5.2 Optimal Networks 

We now turn to consideration of optimal networks.
Given a configuration x of n cities in the area-n
square, and a value of L which is greater than $n^{-1} \times$
(length of Steiner tree), one can define a number 

(11)  $R_n(x, L)$ = min of $\tilde{R}$ over all networks
      on x with normalized length $\le L$, 

where $\tilde{R}$ is the discretized version (6) calculated using
intervals of some suitable length $\delta_n$. Applying
this to a random configuration X in the finite model
gives, for each L, a random variable 

$R_n(L) := R_n(X, L)$. 

One intuitively expects convergence to some deterministic
limit 

(12)  $R_n(L) \to R_{\mathrm{opt}}(L)$ say, as $n \to \infty$. 

The analogous result for $R_{\max}$ will be proved carefully in [8], and
the same "superadditivity" argument could be used to prove (12). See
[43, 44, 47] for general background to such results. The point is that we
do not have any explicit description of the optimal [i.e., attaining the
minimum in (11)] networks in the finite or infinite models, so it seems
very challenging to prove the natural stronger supposition that the
finite optimal networks themselves converge (in some appropriate sense)
to a unique infinite optimal network for which the value
$R = R_{\mathrm{opt}}(L)$ is attained.
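The finite-model quantities of Sections 5.1 and 5.2 can be estimated numerically. The following is a minimal sketch, not the authors' code. It assumes the Gabriel graph as the example network, takes $r(i,j) = \ell(i,j)/d(i,j) - 1$ as the relative excess route length, and uses a simple binning of city pairs by distance (bin width `delta`) as a stand-in for the discretized statistic (6), whose exact form is not restated here. The function names and the `min_pairs` cutoff are hypothetical.

```python
import numpy as np

def gabriel_graph(pts):
    """Boolean adjacency matrix of the Gabriel graph, plus the distance matrix.

    Cities i and j are joined iff no other city lies in the open disc
    whose diameter is the segment from i to j.
    """
    n = len(pts)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)  # squared distances
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            # k is inside the disc iff |ik|^2 + |jk|^2 < |ij|^2
            blocked = d2[i] + d2[j] < d2[i, j]
            blocked[[i, j]] = False
            adj[i, j] = adj[j, i] = not blocked.any()
    return adj, np.sqrt(d2)

def route_lengths(adj, dist):
    """All-pairs shortest route lengths along network edges (Floyd-Warshall)."""
    ell = np.where(adj, dist, np.inf)
    np.fill_diagonal(ell, 0.0)
    for k in range(len(ell)):
        ell = np.minimum(ell, ell[:, [k]] + ell[[k], :])
    return ell

def finite_model_stats(n, delta=1.0, min_pairs=20, seed=0):
    """Return (L_n, average degree, binned estimate of R) for one realization."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(0.0, np.sqrt(n), size=(n, 2))  # n cities in the area-n square
    adj, dist = gabriel_graph(pts)
    L_n = dist[adj].sum() / 2.0 / n   # each edge appears twice in adj
    deg_n = adj.sum() / n             # sum of degrees divided by n
    ell = route_lengths(adj, dist)
    iu = np.triu_indices(n, k=1)      # each unordered city pair once
    d = dist[iu]
    r = ell[iu] / d - 1.0             # relative excess route length (assumed form)
    bins = np.floor(d / delta).astype(int)
    counts = np.bincount(bins)
    sums = np.bincount(bins, weights=r)
    keep = counts >= min_pairs        # ignore sparsely populated distance bins
    R_n = (sums[keep] / counts[keep]).max()
    return L_n, deg_n, R_n

L_n, deg_n, R_n = finite_model_stats(200)
print(f"L_n = {L_n:.3f}, average degree = {deg_n:.3f}, binned R = {R_n:.3f}")
```

For moderate $n$ the estimates are affected by boundary effects of the square, so they should be read only as rough approximations to the limits in (10). Computing the optimum in (11) would additionally require a search over all networks of bounded length, which this sketch does not attempt.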

5.3 The Curve $R_{\mathrm{opt}}(L)$

Every possible network on the infinite Poisson point process defines a
pair $(L, R)$, and the curve $R = R_{\mathrm{opt}}(L)$ can be defined
equivalently as the lower boundary of the set of possible values of
$(L, R)$. There is no reason to believe that proximity graphs are exactly
optimal, and, indeed, Figure 7 shows that the Delaunay triangulation is
slightly more efficient than the corresponding $\beta$-skeleton. But our
attempts to do better by ad hoc constructions (e.g., by introducing
degree-3 junctions; see Figure 8 for an example) have been unsuccessful.
Moreover, the fact that the two special models in Figure 7 lie close to
the $\beta$-skeleton curve lends credence to the idea that this curve is
almost optimal. We therefore speculate that the function
$R_{\mathrm{opt}}$ looks something like the curve in Figure 9, which we
now discuss.

What can we say about $R_{\mathrm{opt}}(L)$? It is a priori
nonincreasing. It is known [47] that there exists a Euclidean Steiner
tree constant $L_{\mathrm{ST}}$ representing the limit normalized Steiner
tree length in the random model, and clearly
$R_{\mathrm{opt}}(L) = \infty$ for $L < L_{\mathrm{ST}}$. The facts

(13)    $R_{\mathrm{opt}}(L) < \infty$ for all $L > L_{\mathrm{ST}}$;
        $R_{\mathrm{opt}}(L) \to 0$ as $L \to \infty$

are not trivial to prove rigorously, but follow from the corresponding
facts for $R_{\max}$ proved in [8]. But we are unable to prove rigorously
that $R_{\mathrm{opt}}(L)$ is strictly decreasing or that it is
continuous.
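The "lower boundary" description also explains why the curve is a priori nonincreasing: any network of normalized length at most $L$ is still admissible at every larger budget. As a small illustration, the sketch below extracts the nonincreasing lower boundary from a finite list of $(L, R)$ pairs. The function name is ours, and the input values are made-up placeholders, not values from Table 1 or Figure 7.

```python
def lower_boundary(pairs):
    """Nonincreasing lower boundary of a finite set of (L, R) pairs.

    Returns the pairs that are not beaten by a network that is both
    no longer (smaller or equal L) and more efficient (smaller R).
    """
    frontier = []
    best_R = float("inf")
    for L, R in sorted(pairs):      # increasing L, ties broken by smaller R
        if R < best_R:              # strictly improves on everything shorter
            best_R = R
            frontier.append((L, R))
    return frontier

# Placeholder (L, R) values for six hypothetical candidate networks.
candidates = [(1.0, 0.40), (1.5, 0.25), (1.8, 0.30),
              (2.0, 0.15), (3.0, 0.10), (3.5, 0.12)]
print(lower_boundary(candidates))
# -> [(1.0, 0.4), (1.5, 0.25), (2.0, 0.15), (3.0, 0.1)]
```

The boundary of a finite candidate set only gives an upper bound on the optimal curve, since networks outside the set might do better.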

6. FINAL REMARKS 
6.1 Toy Models for Road Networks 

The idea of using proximity graphs as toy models 
for road networks has previously been noted [30] but 
not investigated very thoroughly. It is an intuitively 
natural idea to a network designer: whether or not 
to place a direct road from city i to a nearby city
j depends (partly) on whether some other city k is
close to the line between them. 
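The design rule just described can be written as a short predicate. The sketch below is illustrative only, with function names of our choosing. It uses the standard definitions of two proximity graphs: in the relative neighborhood graph a third city blocks the road if it is closer to both endpoints than they are to each other, and in the Gabriel graph it blocks the road if it lies inside the disc having the road as diameter.

```python
import math

def blocks(p, q, k, rule="rng"):
    """True if city k rules out a direct road between cities p and q."""
    d_pq = math.dist(p, q)
    if rule == "rng":        # lune: k is nearer to both p and q than d(p, q)
        return max(math.dist(p, k), math.dist(q, k)) < d_pq
    if rule == "gabriel":    # disc with diameter pq
        return math.dist(p, k) ** 2 + math.dist(q, k) ** 2 < d_pq ** 2
    raise ValueError(f"unknown rule: {rule}")

def proximity_graph(cities, rule="rng"):
    """Edges (i, j), i < j, that no third city blocks (brute force)."""
    n = len(cities)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            others = (k for k in range(n) if k not in (i, j))
            if not any(blocks(cities[i], cities[j], cities[k], rule) for k in others):
                edges.append((i, j))
    return edges

cities = [(0, 0), (4, 0), (2, 3)]
print(proximity_graph(cities, "rng"))      # [(0, 2), (1, 2)]
print(proximity_graph(cities, "gabriel"))  # [(0, 1), (0, 2), (1, 2)]
```

In this three-city example the city at (2, 3) lies in the lune of the other two but outside the disc on their connecting segment, so the direct road between cities 0 and 1 is omitted under the first rule and kept under the second.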

As observed by a referee, for the kind of models studied in this paper we
expect route length $\ell(i,j)$ between distant cities to be roughly
proportional to graph distance (number of edges), which is a more
relevant quantity in some contexts. However, when one considers design of
optimal networks, replacing or partially replacing route length by graph
distance leads to quite different optimal networks [1, 22]. For some
other cost/benefit functionals leading to yet different optimal networks
see [2, 14].

6.2 Rigorous Proof of Finite R in Random 
Proximity Graphs 

Table 1 presented the Monte Carlo numerical value $\approx 0.38$ of $R$
for the relative neighborhood graph on random points. From a rigorous
viewpoint, the assertion that a random network has $R < \infty$ is
essentially the assertion that $\rho(d) = O(d)$ as $d \to \infty$. This
is often nontrivial to prove. A general sufficient condition for this
property, which applies to the relative neighborhood graph (and hence all
proximity graphs), is proved in [3]. The related fact that the limit
$\lim_{d \to \infty} \rho(d)/d$ exists is proved in [4].

6.3 Real-World Trade-Off Between Network 
Length and Route-Length Efficiency 

Recall that our central theme is seeking to quantify the trade-off
between normalized network length $L$ and route-length efficiency $R$.
Figure 9 suggests that for optimal networks the "law of diminishing
returns" sets in around $L = 2$ (for comparison, this is the value of $L$
corresponding to the square grid network), in that $R_{\mathrm{opt}}(L)$
decreases rapidly to around 0.13 as $L$ increases to 2 but decreases only
slowly as $L$ increases further. This suggests a kind of "economic
prediction" for the lengths of real-world networks which are perceived by
users to be efficient in providing short routes:

    the length of an efficient network linking $n$ cities in a region of
    area $A$ will be roughly $2\sqrt{An}$.

Here the $\sqrt{An}$ arises from undoing the normalization and the "2" is
the value of $L$. Of course, this is rough: we mean "closer to 2 than to
1 or 3."
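To make the arithmetic concrete, here is a small sketch of the prediction. The function name and the example region (100 cities in 10,000 square kilometres) are hypothetical and serve only to show the scaling.

```python
import math

def efficient_network_length(n_cities, area, L=2.0):
    """Predicted network length L * sqrt(area * n), in the length unit of sqrt(area)."""
    return L * math.sqrt(area * n_cities)

# Hypothetical region: 100 cities in 10,000 km^2.
print(efficient_network_length(100, 10_000))         # 2000.0 (km)
print(efficient_network_length(100, 10_000, L=1.0))  # 1000.0 (km)
print(efficient_network_length(100, 10_000, L=3.0))  # 3000.0 (km)
```

Under these made-up inputs, "closer to 2 than to 1 or 3" corresponds to a total length nearer 2000 km than 1000 km or 3000 km.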

6.4 Other Results for the Random Network 
Models 

There is substantial literature on the networks (MST, proximity graphs,
Delaunay triangulation) in the deterministic setting. In the random case,
central limit theorems for total network length have been studied in many
models: for the MST in [29, 31, 32], and for the Delaunay triangulation,
Voronoi tessellation, relative neighborhood and Gabriel graphs in
[12, 25, 42]. Large deviation estimates for total network length are
given for the Gabriel graph in [46], Section 11.4, and presumably could
be extended to other models. Otherwise the literature for the random case
is rather diffuse, with different focuses for different networks. For
instance, work on MSTs has focused on connections with critical continuum
percolation [17]. For the relative neighborhood graph and the Gabriel
graph, [20] calculates the average degree $\bar{d}$, and [18] shows that,
in the finite model, in a certain range the $\beta$-skeletons have

(14)    $R_{\max}$ grows as order $\sqrt{\log n / \log\log n}$

and [21] shows the same order for maximum vertex degree in the Gabriel
graph. As for the Delaunay triangulation, there has been surprisingly
little follow-up to the seminal analysis by Miles [35] (various maximal
statistics are studied in [16]), though the closely related Voronoi
tessellation has been studied in more detail [36].

Fig. 9. Speculative shape for the curve $R_{\mathrm{opt}}(L)$, with
$\circ$ and $\bullet$ values from tractable networks in Figure 7.

6.5 Speculative Applications of Random 
Proximity Graphs 

Random proximity graphs seem an interesting object of study from many
viewpoints, in particular, as an attractive alternative to random
geometric graphs for modeling spatial networks that are connected by
design. It is remarkable that results such as (14) are the only
nonelementary results about them that we can find in the literature. As
well as being natural models for road networks, proximity graphs might be
useful in modeling communication networks suffering line-of-sight
interference.

At a more mathematical level, for questions such as spread-out
percolation [41] or critical value of contact processes [15], random
proximity graphs with small $A$ are an interesting alternative to the
usual lattice- or random graph-based models. For instance, it is natural
to conjecture that the critical value $p^*_A$ for edge percolation on a
random proximity graph with template $A$ satisfies

(15)    $p^*_A \sim \pi^{-1}\,\mathrm{area}(A)$ as $\mathrm{area}(A) \to 0$

[the right side $= 1/\bar{d}$, the reciprocal of the average degree, from
(9)] and that the critical value for the contact process has the same
asymptotics.
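The identification of the right side of (15) with the reciprocal of the average degree can be checked by a short calculation. The sketch below writes $\bar{d}$ for the average degree. It rests on an assumption of ours about the normalization: $\mathrm{area}(A)$ is taken to be the area of the template for two points at unit distance, so that the template for two points at distance $s$ has area $s^2\,\mathrm{area}(A)$. In the rate 1 Poisson model, a point at distance $s$ from a given city is then its neighbor with probability $\exp(-s^2\,\mathrm{area}(A))$, the probability that the template contains no other city.

```latex
\bar{d}
  = \int_0^\infty 2\pi s \, \exp\bigl(-s^2 \,\mathrm{area}(A)\bigr)\, ds
  = \frac{\pi}{\mathrm{area}(A)},
\qquad\text{so}\qquad
\pi^{-1}\,\mathrm{area}(A) = \frac{1}{\bar{d}} .
```

As a check under the same assumption, the Gabriel graph has the disc with diameter 1 as template, of area $\pi/4$, giving average degree 4. Small templates thus correspond to large average degree, which is the regime in which (15) is conjectured.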

6.6 Analogies Between Deterministic and 
Random Networks 

As mentioned earlier, we may make very loose analogies between particular
networks on random points and particular deterministic networks in
Figure 4, based in part on exact equality of the average degree $\bar{d}$
in the latter three cases:

    Relative n'hood graph   ↔  punctured lattice,
    Gabriel graph           ↔  square lattice,
    Hammersley network      ↔  diagonal lattice,
    Delaunay triangulation  ↔  triangular lattice.
6.7 Scale Invariant Continuum Networks

Introducing the statistic $R$ can be viewed as one approach to resolving
the "paradox" from [7], discussed in Section 3.2, that the more natural
statistic $R_{\mathrm{ave}}$ does not lead to realistic optimal networks
in the $n \to \infty$ limit. This particular approach was prompted by
visualizing real-world road networks (cf. the discussion in Section 3.3).
Let us mention a mathematically more sophisticated alternative, under
study as a work in progress [5]. Instead of a discrete Poisson process of
cities, we imagine a continuum limit. That is, for each finite set
$(z_1, \ldots, z_k)$ of points in the plane, there is a random network
$S(z_1, \ldots, z_k)$ linking the points, consistent as more points are
added. Mathematically natural structural properties for the distribution
of such a process are as follows:

(i) translation and rotation invariance,

(ii) scale invariance,

where the latter means that routes, as point-sets in $\mathbb{R}^2$, are
invariant in distribution under Euclidean scaling. This implies that the
quantity $\rho(d)$ analogous to (5), assumed finite, is a constant, which
we can call $R'$. The analog $L'$ of $L$ is defined by

    the expected length of the network on $n$ uniform random points in the
    area-$n$ square grows $\sim L'n$ as $n \to \infty$.

In this setting we can study the optimal trade-off between $L'$ and $R'$,
and the kind of "paradoxical" Figure 5 network cannot arise because it
violates scale-invariance.